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, ^ Abstract 

\^ We study sequential Bayesian inference in continuous-time stochastic kinetic models with 

T-H latent factors. Assuming continuous observation of all the reactions, our focus is on joint 

inference of the unknown reaction rates and the dynamic latent states, modeled as a hidden 
Markov factor. Using insights from nonlinear filtering of jump Markov processes we develop 
a novel sequential Monte Carlo algorithm for this purpose. Our approach applies the ideas 
of particle learning to minimize particle degeneracy and exploit the analytical jump Markov 
structure. A motivating application of our methods is modeling of seasonal infectious disease 
outbreaks represented through a compartmental epidemic model. We demonstrate inference in 
such models with several numerical illustrations and also discuss predictive analysis of epidemic 
countermeasures using sequential Bayes estimates. 
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^ 1 Introduction 

O 

Stochastic jump-Markov models have become ubiquitous in multiple application areas, including 
systems biology, molecular chemistry, epidemiology, queuing theory and finance. An important 
_ ^ class of such systems is described using the chemical reaction system paradigm, which classifies 

^ jumps in terms of a finite number of possible reactions. System transitions are specified prob- 

^ abilistically in terms of the distributions of the inter-reaction periods and next-to-fire reaction 

type. The reaction rates depend on the current system state and impose a Markovian structure. 



termed a stochastic kinetic model (SKM) [Wilkinson 2006 



While the basic setup assumes time-stationarity, in many contexts time-dependence, seasonality, 
and other regime shifts of reaction rates are crucial. A popular way to incorporate stochastic shifts 
in the environment is through Markov modulation, i.e. introduction of an additional (latent) 
dynamic factor that affects transition rates. We christen such systems Hidden Markov Stochastic 
Kinetic Models (HMSKM). 

Usage of an HMSKM in an application requires statistical estimation of the reaction rates and 
environmental factors. In the present paper we are concerned with Bayesian inference which allows 
unified filtering of latent system states and parameters and full quantification of the posterior 
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uncertainty. Moreover, anticipating dynamic optimization and decision making applications, we 
are interested in sequential inference. As a motivating example of such HMSKM inference, we 
describe below a stochastic model of seasonal epidemics of infectious diseases. Here the SKM 
paradigm is used to give a mechanistic description of outbreak progression using a compartmental 
description of the population, while the latent factor represents environmental factors affecting the 
outbreak (such as new genetic shifts in the pathogen or weather patterns). Sequential inference 
of the reaction rates (infectiousness, etc.) and dynamic seasonality determines outbreak severity 
and is the central problem in biosurveillance and corresponding public health policy-making. 
Bayesian inference in SKMs is typically accomplished through Markov chain Monte Carlo 



(MCMC) methods [Golightly and Wilkinson 


2006 


Boys et al 


2008 


Niemi 


2009 


Golightly and 


Wilkinson 


2011 and has been also used for related epidemics inference problems 


O'Neill 


2002 



MCMC analysis after every new data point is obtained and therefore is not computationally effi- 
cient. A more suitable and flexible alternative is to apply sequential Monte Carlo methods (SMC), 



also known as particle filters, which use an empirical selection-mutation mechanism iDoucet et al 



2001, Cappe et al 2005]. The main drawback of SMC is particle degeneracy which becomes 
especially serious during estimation of constant parameters. A recent class of particle learning 



methods iCarvalho et al 2011, Dukic et al 2010 has been designed to overcome these challenges 



by exploiting additional analytical structures. Starting with these ideas we develop new SMC 
methods targeting continuous-time stochastic kinetic models, demonstrating the efficiency and 
tractability of Bayesian inference in this novel context. 

The aims of this work are three-fold. First, we wish to draw the attention of the computational 
statistics and stochastic simulation community to the convenient analytic structure of hidden 
Markov stochastic kinetic models. This structure allows for highly efficient Bayesian sequential 
inference algorithms in rather general settings, which to our knowledge supercede previous results 
in this direction. Second, we provide a new extension of the particle learning SMC method. In 
contrast to existing literature, we work in continuous time and further consider a hidden Markov 
factor. Thus, to compute the predictive and conditional likelihoods we use tools from stochastic 
filtering (namely filtering of doubly-stochastic Poisson processes) which have not been hitherto 
used in this context. Our results demonstrate the wide applicability and attractiveness of particle 
learning in jump-Markov models. 

Finally, we extend the burgeoning body of literature on sequential inference in compartmental 
epidemiological models. Precisely, we propose a new SIR- type model of seasonal epidemics with 
a stochastic seasonality factor. The model allows for tractable online inference assuming full 
observation of all epidemiological events and can be used as a testbed to analyze alternative ways 
of epidemic control under imperfect information. 

In the next section we provide more background on our main motivating application in epidemic 
modeling; the rest of the paper is organized as follows. Section [2] provides a general setup of a 
hidden Markov stochastic kinetic model; the resulting inference problem and the particle learning 
algorithm are constructed in Section [3| Section |4] then illustrates the results and provides numer- 
ical examples for a simple SIS model of seasonal epidemics. Section [5] then discusses predictive 
analysis of outbreak countermeasures in the developed sequential framework. 



1.1 Seasonal Epidemics 

Probabilistic modeling of infectious disease epidemics is an important tool in public health anal- 
ysis. Stochastic models provide a tractable way to quantify the uncertainty about epidemic 
dynamics and carry out predictive analysis on the future path of the outbreak. Public health 
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Figure 1: Percentage of Influenza Like Illness (ILI) outpatient visits in Santa Barbara county in 
2004-2011 based on weekly sentinel providers data collected by Santa Barbara County Department 
of Public Health Bellomy 2011 . No data is normally collected during the summer. 



agencies increasingly rely on such techniques to compare alternative courses of action and mount 
an efficient policy response through behavioral or biomedical interventions. 

Recurring seasonal epidemics, with a prime example of influenza, provide some of the best 
testbeds for quantitative analysis thanks to availability of time-series data of previous outbreaks. 
Figure [T] shows the ILI flu incidence statistic in Santa Barbara county in California over the 
past seven years. ILI or influenza- like illness is a formal description of symptoms that typically 
occur in infected individuals, and is the most common proxy for measuring the incidence of flu 
in the population. The figure illustrates the apparent fact that flu is generally more prevalent 
during winters, due to emergence of new virus strains and colder weather making susceptibility 
higher. Nevertheless, significant variation in flu incidence cases and timing of outbreaks can be 
observed year over year. A case in point was the 2008-09 season when the new HlNl influenza 
strain caused a world-wide pandemic peaking in early Fall 2009, scrambling the traditional flu 
calendar of public health action^ To summarize, Figure [l] demonstrates three main features of 
flu outbreaks: (i) strong seasonality; (ii) year-to-year variability; and (iii) stochastic fluctuations 
during each outbreak. Other endemic infectious diseases with similar patterns include rotavirus, 
norovirus, measles, dengue fever, and cholera [Crassly and Fraser||2006 



Full understanding of community epidemic dynamics remains elusive given the paucity of avail- 
able data. While issues such as inhomogeneous mixing, age effects, and spatial interactions are 
undoubtedly crucial, their mathematical and statistical modeling requires large-scale computa- 
tional and modeling efforts Halloran et al 2008 . Alternatively, mechanistic models of outbreak 



provide a simplifled but highly tractable paradigm of describing outbreak progression that can 
be calibrated to real data. A popular mechanistic approach is given by the class of stochastic 
compartmental models Andersson and Britton 20001 . Thus, the population is partitioned into 



several classes of individuals based on their epidemiological status, such as Susceptible, Infected, 
etc., and the outbreak is described on the macroscopic level in terms of transition rates Q among 



'^The usual H3N1 strain was also present that year and hence the time series effectively combines two distinct 
outbreaks 
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the compartments. This SIR framework has been successfully used in modeling a range of infec- 
tious diseases, ranging from influenza Merl et al|2009 to measles Cauchemez and Ferguson |2008 



and foot-and-mouth disease [Jewell et al 2009 . Probabilistically, this approach corresponds to 
imposing a Markovian structure at the group level and captures the intrinsic uncertainty through 
the stochasticity of the transitions taking place. 

To model the observed strong seasonality of influenza outbreaks, we introduce a further level 
of uncertainty through a stochastic seasonal factor {Mt}. Thus, the transition rates between the 
compartmental classes are modulated by {Mt}, which can be interpreted as the seasonal presence 
of a new pathogen. This seasonal factor is evidently an abstract object, i.e. latenl|^ We note 
that its effect is indirect, since {Mt} only affects the rates of the transitions, not transitions 
themselves. We refer to LeStrat and Carrat 1999 , Martmez-Beneito et al [2008 for related 



hidden Markov model representations of epidemics. There is further large literature on seasonal 
forcing in epidemics that focuses on deterministic SIR models using tools of dynamical systems 
[Keehng et al|200H [Pushoff et al||2004| [Stone etal,2007] or multi-scale analysis ^Kuske et al,2007] . 

Bayesian inference in compartmental models consists of estimating the transition rates between 
classes and the dynamic size of each class. This is by now a classical problem in biosurveillance; 
for example, a whole strand of literature is devoted to estimating the basic reproductive ratio 



TZ which is the single most important parameter for predicting epidemic impact [Ball and Neal 



2002, Cintron- Arias et al 2009, O'Neill 2002 Chowell et al 2009 . The complementary inference 



of latent states is treated in the aforementioned LeStrat and Carrat 1999[, Martmez-Beneito et al 



2008 



However, little research has been done for joint parameter and state estimation due to the 
associated computational challenges inherent in both Markov chain and sequential Monte Carlo 
approaches. In this paper we present a novel algorithm for such joint inference of {Mt} and 
outbreak parameters Q using sequential Monte Carlo. 

Beyond pure inference, the ultimate objective of decision-makers is to mitigate the epidemic 
impact. This is achieved by implementing response policies including vaccination, quarantine, 
pharmaceutical or hospital treatment, information campaigns, etc. Since any policy involves 
budgetary or human resources, a balance is needed between costs due to epidemic morbidity 
and mortality and costs arising from policy actions. Moreover, to be optimal, a policy must be 
adaptive, i.e. rely on the latest collected information; thus, policy analysis is inherently linked 
to sequential inference. Quantitative approaches to such dynamic epidemic management include 
continuous-time Markov chain models [Merl et"al 2009 , Markov decision processes Tanner et al 
2008 , systems of ordinary differential equations Chowell et al'2009], agent-based representations 
[Halloran et al|[2008j and stochastic control [Ludkovski and Niemij,2010j . However, existing meth- 
ods are limited in adequately addressing the questions of unknown system dynamics, parameters 
and states. Below we demonstrate that regime-switching compartmental models in fact provide 
a flexible paradigm for analyzing these issues by allowing for accurate sequential inference. 



2 General Setup 

We consider a d-dimensional continuous-time jump-Markov process {^t}, Xt G N"^ on a proba- 
bility space (0, J^, P), that evolves according to laws of a chemical reaction system. Namely, Xt 
denotes the (non- negative) number of species of class k = 1, . . . ,d and there are Q reaction types 
with corresponding stoichiometry vectors Ag G Z'', q = 1,2, . . . , Q. The A^'s indicate the impact 
of a reaction on Xt: denoting by Tfc, k = 1, . . . , the reaction times, i.e. the jump times of {^t}, if 



As stated by Crassly and Fraser [2006 "despite the near ubiquity of this phenomenon [seasonahty] , the causes 



and consequences of seasonal patterns of incidence are poorly understood" 
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the k-th reaction is of type q, then X,-^ = -^r^- + ^q- Between reactions {Xt} is constant. 

A convenient representation of {Xt} is via a multivariate marked point process X = {Tk,Rk), 
k = 1,2, . . . , where 

Tfc := inf{t > Tk-i : Xt ^ Xt-}, k>l, with tq = 0, 

are the transition epochs, and 

Rk:=Xr^-Xr,_,e{Ai,...,AQ}, k>l, 

are the corresponding reactions which may be canonically identified with reaction types {1, 2, . . . , Q}. 
It is immediate that 

Xt = Xo + Y, Rk, (1) 

Tk<t 

showing the equivalence of the two formulations. We also introduce the counting processes 
N!:=J2hRk=A,} and Nf'' := hR,=A„M^^=i} 

Tk<t Tk<t 

for the number so far of each reaction by type, so that 

Q 

Xt = Xo+J2^t ■\- 

q=l 

The probability triple {Q, T , P) also supports a finite-state jump process {Mt\ that represents 
the modulating factor. Intuitively, Mj G {1, . . . ,X} is a Markov chain with generator 

GM:=(/Xy), i,ie{l,...,X}. (2) 

Slightly more generally, we take Gm = GM{t, Xt), allowing the transition rates of {Mt} to depend 
on time and the current state of {Xt}, so {{Xt, Mt)} is jointly Markov. 

To complete the description of {Xt} and {Mt} we finally specify the transition rates a of {-^t}, 
or equivalently the arrival rates of the counting process iV?'*. We assume that the corresponding 
propensity functions are of the form 

ag{t, Xt, Mt) = eg{Mt) ■ hg{t, Xt) , 9=1,..., Q, 

where B = {9i{l),9i{2), . . . ,6q{I)} are the reaction rates, modulated by {Mt}, and /I's are the 
mass action laws. Let 

Q 

a{t, Xt, Mt) := ag{t, Xt, Mt) (3) 

q=l 

be the total current arrival rate, and 

■""^ ' *' a{t,Xt,Mt) ^ ' 

be the conditional likelihood of the next reaction being of type q. 
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Due to the local Markovian structure, the distribution of the point process (r^, R^), conditional 



on a path of {Mt} is given explicitly by Amrein and Kiinsch 2012 



p{ti,Ri, . . . ,Tn,Rn\Ms,s < t) = exp ^- j a{s,Xs,Ms)ds^ 



k=l \q=l 



} ' 



(5) 



where the first term accounts for the total arrival intensity on [0, t] and the second term for the 
likelihoods of the observed event types. Recall that if Mg is constant and a's are independent of 
t, then the inter-arrival times are exponentially distributed. 



3 Inference 

Statistical inference in the presented framework consists of estimating the reaction rates and the 
seasonal factor {Mt}. As already mentioned, throughout we assume that {Xt} is fully observed, 
i.e. by any date t, a full record of all reaction times before t and corresponding reaction types 
is available. To explain our method, we shall consider the following three cases: 

(a) G unknown; {Mt} observed; 

(b) G known; {Mt} unobserved; 

(c) Both G unknown and {Mt} unobserved. 

Case (c) is the main object of interest in our study; however to understand its properties, in the 
next sections we briefly review Cases (a) and (b). 

Probabilistically, the three cases are distinguished by the different filtrations of observed infor- 
mation. Let {Gt), 

Gt = := a{Xs, M, : < s < t) V a{e), (6) 

denote the full filtration and using obvious notation let {J-f''^^), {J-f'''^), and (J-t^) denote the sub- 
filtrations corresponding to cases (a), (b), (c), respectively. Then our aim is to compute the (joint) 
posterior distribution Zt := p{Mt,Q\J-t) of the seasonal factor and the parameters for the above 
choices of filtrations. For later use we also define the posterior probabilities := [Hj, ■ ■ ■ , ) 
where 

nj=p-(Mt = i|j-f), ie {!,...,!}, (7) 

where P'^ denotes the conditional probability measure given the prior distribution Ho = vr of Mq. 
3.1 Conjugate Inference of Epidemic Parameters 

If {Mt} is observed (i.e. {J-^'^) is available), then {Xt} forms a time-inhomogeneous Markov 
chain. In particular, a full description and analysis of {Xt} is possible on the intervals [o"£, o"^+i] 
where {at) are the transition times of {Mt}: a^+i = inf{t > ai : Mt / Mt-}. 



6 



It is well known that conjugate Bayesian updating of is available using Gamma priors Boys 



et al||2008 Amrein and Kiinsch 2012 . Specifically, let 

where a. and h. denote the shape and rate parameters respectively of the Gamma distribution, be 
the independent priors of dq. Then given F^'^^ we can express the full data likelihood as follows: 

X Q 

(8) 



1=1 q=l 



h\{t):=h\+ hq{s,Xs)l{Ms=i}ds, 
Jo 

where we recall that N^'^ is the number of observed reactions of type q during the z-regime. 



We summarize those sufficient statistics as st = {aq{t) , bq{t)) , i 
denote the updating by the function S{-) from Q , i.e.. 



1,...,X, 



1, . . . , Q and 



ST = S{st,t,T,M.) := lal{t) + N;};'-N^'\bl{t)+ / X„ M,)l|A/,=i} 



q=l 



(9) 



Relation ^ provides an explicit sequential way to update the posterior of G along the trajectory 
of{{Xt,Mt)}. 

3.2 Estimation of the Latent Seasonal Factor 

In case (b), we assume that all rates Q are known and a full record of {Xt} is available, but the 
path of {Mf} is unobserved. Intuitively, inference of {Mf} is based on comparing the likelihoods 
of observed event epochs/types conditional on the possible values of the seasonal factor, using 
Using the representation of {Xt} as a state-dependent marked doubly-stochastic Poisson process 
implies that Ht can be described in closed-form and possesses piecewise-deterministic dynamics. 
Applying [Ludkovski and Sezer|[20T2 Prop 2.1] we obtain the following characterization 

Proposition 1. The sample paths of {Ut} follow 

Ut = X{t - Tk,Ur^,Xr^), Tk <t < Tk+l,k £N 

afifc(T-fc,X^,-,On'^fe- 



ni 



1, 



where the vector field x{t,TT,X) is defined via 

'(ti >t,Mt = i\Mo ~ vf, Xo = X) 



x'{t,n,X) 



(ri > t\Mo ~ 7f,Xo = X) 



(10) 



(11) 



i G {1, . . . ,X}, and has the explicit solution P(ti > t, Mt = i\MQ ~ vf, Xq = X) = 7r-exp(Jp Gm{s, X) 
A{s,X)ds) with A{s, X) := diag(a(s, X, •)) and 



A{s,X)i 



a{s, X, i) ifi = i'; 
otherwise. 



(12) 



Proposition [T] completely identifies the distribution of the X-dimensional posterior 11^ of 
through the recursion (10). 
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3.3 Joint Sequential Inference using SMC 

We now turn to our main case (c) where we only have access to (J^i^), so that neither {Mt} 
nor Q is observed. In that case, even if ah transitions of {Xt} are fuUy observed, the posterior 
distribution Zt = p{Mt, & \J-^) no longer admits any sufficient finite-dimensional statistics. Hence, 
no closed-form analysis is possible and we turn to constructing efficient numerical approximation 
schemes. 

The sequential Monte Carlo approach to recursively update Zt consists of constructing a particle 
approximation 



where 5 is the Dirac delta function and each of the J particles is defined by its (normalized) 



weight ifp^ > 0, its M- location m[''^ and its parameter versions 6^^\ In other words. 



P((Mi,G)eA|J-f):^ ^ 



w 



U) 



The particles are updated using a propagation-selection scheme. However, since the parameters 
are fixed throughout, it is well known that naive SMC implementation typically leads to par- 
ticle degeneracy, namely the diversity of 6^^^ 's across the particles increasingly diminishes due to 



resampling. As we shall see in Section 4.2, this issue is acute for HMSKMs. 

The availability of the sufficient statistics Sf from Q for Q conditional on {Mt} allows to 
dramatically reduce degeneracy. We recall the early approach of Storvik |2002| who applied a 



basic bootstrap particle filter Gordon et al 1993 on the pair {Xt,St) 
observations is evaluated by sampling from the posterior of 0, 



where the likelihood of 



I.e. 



p{Xt\Mt,St)^p{Xt\Mt,9'^^^) 0(^')~st i.i.d. 



(13) 



3.4 Particle Learning in HMSKM 



In our case, even more efficiency can be achieved through a resample-move SMC [Gilks and 
Berzuini|2001 which takes advantage of the explicit predictive likelihood of X given (see (15) 



below) and leads to a version of the particle learning (PL) framework originally proposed within a 

The use of a resample-move algorithm allows 
Thus, the filtered distribution at 



discrete-time setting in Carvalho et al 2010, 2011 



direct sequential filtering of {Mt,St) rather than the static 0. 
time t is approximated by the particle cloud 



2t 



r(J) 



(14) 



where s'f^ are again the parameter sufficient statistics in ([9]). Thus, the conditional distribution 
of 0, /e('|st), is a product of independent Gamma's and the overall Z^"^^ represents the posterior 
of each Oa mixture of Gamma distributions. 
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Require: Priors Sq, Hq, initial state Xq, number of particles J 

1: Sample ttiq''^ ~ Ho i.i.d., set Sq^ <— Sq, j = 1, . . . , J 

2: loop { for = 0, 1, . . . ,} 

3: Sample 6^'^ ~ p{0\si{^), j = 1,...,J 

4: Calculate weights w'^^l^ oc p{Tk+i — tj,, Rk+i\ml^J, 9^^^) 

5: for j = 1, . . . , J do 

6: Re-sample j' oc tf^li where j' G {1, . . . , J} 

7: Sample a trajectory m|:^^ ^^^^j using the conditional law p{M.\m[^J , 9^^ \ti^+i, Rk+i) 

8: Update 4+1 ^ 5(4''\Tfe,rfc+i,mg_^^^^]) 
9: end for 
10: end loop 

Algorithm 1: Particle Learning for Hidden Markov Stochastic Kinetic Model 



Algorithm [T] summarizes in pseudo-code the steps of the proposed particle learning algorithm. 
Its main steps are computing particle weights in (4) for resampling using the predictive likelihood 
of the next event, forward propagation step (6) using the conditional law of the environment factor, 
and updating of the sufficient statistics step (7) for the parameters. Overall, besides the analytical 
results detailed below, only the ability to simulate {Mt} and {Xt} is needed to implement the 
above Monte Carlo scheme, highlighting the flexibility of PL. A basic simulation method for SKM 
that is exact and can always be used is the Gillespie algorithm (see e.g. Wilkinson 2006] ), here 
slightly extended to take into account additional transition times of {Mt}. 

To calculate the predictive likelihood of the next inter-arrival interval r^+i — Tk and reaction 
type Rk+i conditional on Mr^. and parameters we rely on the analytic expression in ([sj), 

X 

p{Tk+l -Tk,Rk+l\Mrt,,Q) =^p{Rk+l\Mrf,^^ = ^, ©) X p(rfe+i -Tk,Mr^^^ =i\Mr^,Q). (15) 

i=l 

The first term on the right-hand-side is /^(rfe+i, X^-^..^^-, Mt-j.^J using the parameters 0, and the 
second term is 

P(rfc+i -Tfc = t,M,,^^ = i\Mr^ = i\Q) := Pi,i{t)a{Tk+i, Xr,^^^,i) (16) 

with Pi'i{t) being an element of the matrix exponential P{t) = ^fo ^M(s,x)-A{s,x)ds ^ (12). 
When there are just two latent states, |X| = 2, P(t) can be computed explicitly using eigenvector 
decomposition. 

The conditional law of {Mt} given t^+i — Tk,Rk+i is not available in closed form. However, 
using Bayes rule 

^'(^(T-fc,rfe+i]l^rfc, Srfe, 0, Tk+1, Rk+l) 

OC p{Tk+l - Tk, Rk+l\M^r„r,+,],@)p{M^r,,r,+,]\Mr,) (17) 



where 



P{rk+1 - Tk,Rk+l\M^r„r,+,],&) (18) 

= exp|-y" a{s,Xr^:,Ms)ds^ ■ aR^^^{Tk+i, Xr^, Mr^^^^) 
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and |Mt-j.) is determined from the transition matrix of {Mt}. To impleme nt (|17[ ) we 

use a rejection sampling step relying on the fact that there is an easy upper bound of (|18|) : 

G) 

< exp{(rfc - Tfc+i) mma{s,Xr^,i) } ■ maxoR (rk+i, Xr^,i). (19) 
Thus, we simulate using the unconditional law ■p[lv^(^^^^^^Mr^}) and then accept the simulation 



with probability given by the ratio between ( 18 ) and ( 19 ). Since are typically tightly spaced, the 
above conditional likelihoods are all close to each other, requiring only a few additional simulations 
(acceptance probability is usually > 95%). 

Finally, as already mentioned, the sufficient statistics for B are always conjugate- Gamma with 
the updating given explicitly in Q. We note that for typographical convenience in Algorithm [T] 
resampling takes place at each reaction time r^. In practice, any other resampling frequency can 
be chosen; in that case the weights are updated accordingly until resampling takes place. Also, 



a variety of resampling schemes (multinomial, residual, stratified, etc.) are available [Cappe et al 



2005 and can be used to lower Monte Carlo error. 

Like all SMC methods, PL still exhibits sample impoverishment which implies that the filtering 
error H-Z^"^^ — (in an appropriate metric) grows exponentially in T. Thus, exponentially more 
particles are needed to control the Monte Carlo error in terms of the number of observations. We 
therefore recommend utilizing PL on a fixed horizon T as in our application below. 



4 Model of Seasonal Epidemics 

We now return to our main example of a Markov-modulated chemical reaction system — a 
compartmental model of seasonally-forced endemic diseases. As a simple example, we shall analyze 



a classical stochastic SIR-type model ( 20 ) of epidemics that incorporates a latent seasonal factor 
{Mj}. For concreteness, we phrase our discussion in terms of the human influenza virus. 

A basic description of an endemic disease such as flu can be provided using an SIS compart- 



mental model [Andersson and Britton|[2000 , which features just two population compartments of 



Susceptibles {^t} and Infecteds {/t}. Since only partial immunity is available against influenza (it 
is common for an individual, especially children, to have several flu episodes in one season), the 
Recovered compartment is omitted and we assume that upon recovery individuals immediately 



pass back into the susceptible pool. Other models of endemic diseases are reviewed in |Nasell 
(2002 



Let Ml G {1,2} denote the seasonal factor at date t, with Mt = 1 representing low flu season 
and Mt = 2 high flu season. We assume a closed population of constant size N := St + It that 
represents a fixed geographic area (such as a college campus, a town, or a county). We assume 
that {Mt} forms a time-stationary Markov chain with infinitesimal generator 

Gm "= (~^^'^ ^^'^ 

In other words, holding times for {M^} in regime i are exponentially distributed with mean 
Conditional on Mt = i, {Xt} = {St, It} is a jump-Markov process involving two reaction types 

Infection: S + I 21 hi := (It + i)^; 

N (20) 
Recovery: / S'i{i)h2^ ^ /j2 := /j. 
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Thus, new infections take place according to the law of mass action Andersson and Britton 



2000 , where the infection rate is driven by the possible pairings between infected and susceptible 
individuals (assuming homogenous mixing of the full population). Additionally, we add an "im- 
migration of infecteds" rate t which can be viewed as an external reservoir of the flu (e.g. from 
travellers) that provides a constant source of additional infections. We use this term to prevent a 
stochastic fade-out of the epidemic and guarantee endemicity. There are two epidemic parameters 
= {9i, 62), interpreted as infectiousness {61) and mean recovery time (1/^2)- Since St = N — It, 
{It} summarizes the epidemic state and we omit St from further discussion. 

For simplicity, we assume that seasonal variations affect only the contact rate 9i{Mt), so that 
^2 (Aft) = 02 is constant. We moreover assume that the effect on the contact rate is multiplicative, 

9i{2) = {l + SF)ei{l), 

for a known seasonality impact ratio SF > 0. This is meant to model the case where the 
seasonality increases probability of susceptibles becoming infected (due e.g. to cold weather) but 
has no impact on the severity of the flu once infected. 

As described above, given M, the key state {It} forms a recurrent Markov chain on {0, 1, . . . , N}. 
Using the notation in Section [2| we have Rk G {—1, 1} with 

ai(J,i) :=gi(i) ^^+^^y a_i(I,i):=02/, (21) 



for i = 1, 2, and 



N 



9iii)iI+L)iN~I)/N 



if r = 1; 
if r = — 1. 



fril,i) = l e,m+^){N-i)/N+e,i - ■ (22) 

I ei{{){I+i,){N-I)/N+82l 



Since {Mt} only takes on two values, its posterior is described by the one-dimensional proba- 
bility process 

:= F{Mt = 2\fI). 

Applying Proposition [T] and using the fact that 62 is independent of {Mt} leads to the following 
simple dynamics of Y^T 

Corollary 1. The evolution o/{n^} is characterized on each t S [t^'iT^+i) hy 

^ =(1 - 2n?)^2i - SF . ^i^^^^^^#^n?(i - n?), (23) 



and 



{i+SF)nl^_ 




(24) 



An explicit solution to (23) can be obtained by computing the eigen-pairs of the corresponding 
matrix A(I). 

The particle learning Algorithm [l] can be straightforwardly applied in this model using particles 
of the form (mj''^ , s[''^) G {1, 2} xM^. Since 02 is not modulated, its posterior is in fact deterministic 
given J^/, i.e. the same for all particles and only the two sufficient statistics of 9i need to be 
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recorded: 

p (9i\Tl, {m^J\0 < s < t}) = Ga (ai + Nl,h{t)) ; 

p = Ga{a2 + Nlb2 + J h ds); 

where sq = (ai, 02, 61, 62) is the original Gamma prior of 0. 
4.1 Illustration 

To illustrate the above algorithms, this section presents several numerical experiments. Table [l] 
summarizes the parameters used. We consider a single flu season lasting 9 months (approximately 
September through May), which includes on average a 6- month long high-season (so that fi2i — 
365/2). Such a model is meant to capture a single seasonal cycle as is commonly done by US 
epidemiological agencies on a Fall-Spring basis, see Figure [T| 

We consider a fixed population of = 10, 000 individuals with an initial Iq = 50 infecteds. 
During the low season, Oi{l) < O2 meaning that the epidemic would on its own fade out; the disease 
remains endemic through contact with outside infecteds, with a carrying capacity (i.e. the long-run 
expected level of Infecteds freezing environmental fluctuations) of about \\m.t^oo IE[/t|Ms = IVs] ~ 
50. In the high season, 9i{2) > 62 so that an outbreak begins. Even though the infectiousness rate 
increases by just SF = 15%, the resulting carrying capacity jumps to over 500, i.e. more than 5% 
of total. This illustrates that even small changes in the contact rate can have a dramatic effect on 
the equilibrium disease incidence. In line with common estimates, we use average infectiousness 
period of about 4 days, 62 ^ 0.25. 



Table 1: Parameter values used. All rates are daily. 



Parameter 


Meaning 


Value 




Transition rate to high season 


6/365 




Transition rate to low season 


2/365 


02 


Recovery rate 


0.25 


^i(i) 


Low-season infectiousness 


0.235 


01(2) 


High-season infectiousness 


0.27025 


SF 


Seasonality effect 


0.15 


L 


Immigration of outside infecteds 


2 


N 


Population size 


10,000 


T 


Time horizon (days) 


273 




Initial Condition 


(0,50) 


(ai(0),6i(0)) 


Initial Priors for 9i 


(25,100) 


(02(0), 62(0)) 


Initial Priors for O2 


(25,100) 



Figure [2] shows a sample trajectory of the infected population count over the nine months 
in conjunction with the underlying seasonal factor {Mf}. In this scenario, starting with low- 
season, high season {Mt = 2} begins on day 61 and ends on day 182, lasting just over four months 
(compared to average high season length of 1/2-year). The seasonality effect is clear, as soon after 
the beginning of the high season, {It} begins an upward trend which is reversed once Mt = 1 
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again. Nevertheless, we observe a lot of stochastic fluctuations against these main trends (e.g. a 
significant decrease in It around day 100). Overall, there were a total of 10051 infections recorded 
over the period, corresponding to roughly each member of the population becoming infected once, 
in line with observed statistics on influenza. Peak number of infecteds was 335 on day 164. 

Starting with a rather vague prior for 0, Figure [3] shows the sequential Bayesian inference of 
assuming that the trajectory of {Mt,It} shown in Figure [2] is fully observed. We note that the 
posterior means apparently converge to the true values and the posterior credibility interval (CI) 
narrows at roughly a hyperbolic rate over time. As data is accumulated, the oscillations in the 
posterior distributions decrease quickly. 

Figure |4] presents the results from the other nested model (b), namely assuming known param- 
eters G but unobserved seasonal factor {Mf}. In Figure [4] we show the posterior probability 11^ 
of the high season over the same trajectory of {It} shown in Figure [2] We note that generally 
the filter is able to well-identify the present seasonality effect and responds very quickly when 
{Mf} changes (see the very sharp drop around t = 185). The dynamic lag between change in 
the true {Mt} and the response by the filter is on the order of 10-20 days which is quite fast 
and would be difficult to identify with a "naked eye" looking at the trajectory of {It}- At the 
same time, the filter 11^ is highly sensitive to the local behavior of {It} making it very noisy. For 
example the mentioned drop in infecteds around t = 100 causes posterior likelihood of high season 
to decrease from over 95% to as low as 35%, albeit with a sharp reversal once the upward trend 
is re-established. The underlying piecewise-deterministic behavior of {11^} (see Proposition [l]) is 
highlighted in the inset figure which clearly shows the discrete upward jumps of {11^} when new 
infections are recorded. 

Finally, Figure [5] shows the output from running the SMC algorithm for joint inference of 
{Mt,@) using J = 5000 particles. Uncertainty about both parameters and seasonal factor makes 
the posterior credible intervals wider compared to Figure [3j In terms of the posterior probability 
of the high season, the resulting IIj is less volatile compared to Figure [4] and responds slightly 
slower to underlying regime shifts. 

The implemented instance of Algorithm [T] is somewhat computationally intensive since it re- 
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Figure 3: Posterior mean and 95% credibility interval of = (^1,^2) over the sample path of 
{Mt,It} in Figure [2j Solid horizontal lines indicate the true values used. 



quires repeated simulation of paths of {Mt} and evaluation of the predictive and conditional 
likelihoods over each interval [Tk,Tk+i) (so over 20,000 times in the Figures shown) and for each 
particle (J = 5000). However, we note that all these computations are exact, so the only noise 
present is from Monte Carlo re-sampling. Running time to generate Figure [5] is about two minutes 
on a typical 2011 laptop. 



4.2 Comparison to Other Inference Methods 

From the time-series estimation point of view, the stochastic systems we consider are characterized 
by long time-series and short inter-event periods. It is well known that the basic challenge 
of Sequential Monte Carlo over long horizons is particle degeneracy since repeated re-sampling 
necessarily throws some information away and cuts down on diversity of the empirical particle 
cloud. Degeneracy is exacerbated when one is required to estimate constant parameters. Without 



further steps, the basic bootstrap filter of Gordon et al 1993| will degenerate, almost surely as 
t — )• 00, to a point mass estimate of the posterior density. A popular simple solution is the |Liu 



and West 2001 algorithm (henceforth LW) that introduces adjustment moves to particle versions 
gU) of O. We implemented LW as a comparison to the presented PL algorithm and found that 
degeneracy is still prevalent over the second half of the season even with as many as J = 5000 
particles. As such, use of the sufficient statistics for 0-posteriors is crucial for HMSKMs. 

We also implemented the Storvik [2002 filter. Compared to the PL filter, its main difference is 
that Storvik [2002 applies propagate-resample steps, and as such does not require analytic form 
of the predictive likelihood (17) in Step 4 of Algorithm [Tj Instead, one simply uses 



w^J oc exp 



Tfc 



a(/„m(^-))ds) •ag(/r,-,m(J) 



Tfe-l 



~ s. 



where the propensity rates a's use the sampled particle versions 6^^^ 
rithm runs slightly faster since it does not need to evaluate the matrix exponential in (|23 ) . 



Tk-l- 



The Storvik algo- 



To quantitatively compare the performance of the described SMC algorithms, we started by 
creating a benchmark using a PL filter with 2 • 10^ particles. We then generated 100 runs of 
the PL, Storvik and LW schemes each using J = 2000 particles and residual resampling for 



14 




60 120 180 240 

Time t (days) 



Figure 4: Posterior probability nf = F{Mt = 2| J"/) over the sample path of {It} in Figure 
assuming known {61,02). For comparison we also show the respective true trajectory of {Mt}. 



the realization of {Mt,It} in Figure [2] Because in the case study the second reaction rate 62 
is independent of {Mt}, its sufficient statistic is the same for all particles and is computed 
exactly by both Storvik and PL algorithms for any filter size J. On the other hand, the estimate 
of 9i is highly sensitive to correct tracking of {Mt} over time. We first compare the 95% coverage 
probabilities with respect to the true 9i, i.e. how frequently does the true value 61 = 0.235 belong 
to the corresponding 95% posterior CI obtained from SMC. The results for the two time-points 
of Ti = 120 and T2 = 270 days are summarized in Table [2] We find that the PL algorithm 
performs best; the Storvik algorithm has somewhat higher MC errors but still maintains particle 
diversity. The LW algorithm posteriors start to collapse mid- way through the data and completely 
degenerate by the end, so that their coverage probabilities are nil. Figure [6] further shows that 
the posterior 95% CI of the PL algorithm includes fewer outliers, in other words nearly all runs 
of the algorithm recover the correct posterior. 



In Table 2 we also compare the standard error E 



|n?-n?| 



of the posterior probability of 



the high season 11^ with respect to the benchmark filter IIj. We observe that compared to PL, 
the Storvik scheme is more prone to "losing track" of Mt, in other words the Monte Carlo runs 
appear to be more leptocurtic. The LW algorithm in fact usually tracks Ilf well but also had 
several completely failed runs and clearly suffers from particle degeneracy. Overall, this analysis 
confirms our preference for PL; its advantages can be compared to the improvement from the 
bootstrap to the auxiliary particle filter in the classical SMC setup. 



4.3 Discussion 

The presented model is clearly stylized and would not be able to capture all the features of real 
epidemics. Since both and {Mt} are assumed unobserved, it may still be possible to fit real 
data even under such model mis-specification. Nevertheless, in this section we discuss some of 
adjustments that may be made for achieving further realism. 

While year-to-year epidemics arise in different times, there are clear patterns which imply that 
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Figure 5: Joint inference of {Mt,Q) over the trajectory of {It} in Figure [2] Left panel shows 
the posterior probability of the high season 11^ = F{Mt = 2|J^/). The other two panels show the 



posterior median and 95% credibility interval of the two parameters 0i,^2- The SMC algorithm 
used J = 5000 particles. All other parameters are from Table [TJ 




Figure 6: Posterior quantiles of 9i at T2 = 270 days using the PL and Storvik algorithms with 
J = 2000 particles. The histograms show the median and 95% CI quantiles across 100 runs of 
each algorithm. The true value of 9i is indicated by the solid horizontal line, the dashed lines 
indicate the corresponding quantiles from a benchmark run of PL with 20000 particles. 
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Table 2: Comparison of SMC algorithm performance. All algorithms used J = 2000 particles 
and we took Ti = 120, T2 = 270 days for the path of {Mt,It} in Figure [2} The LW algorithm 
had a tuning parameter oi h = 0.97. 











PL 


Storvik 


LW 


95% Cov. Prob 


of 


6*1 at 


Ti 


0.96 


0.87 


0.32 


95% Cov. Prob 


of 


6*1 at 


T2 


0.74 


0.65 


0.01 


Std Error of 


at 


Ti 




0.200 


0.272 


0.455 


Std Error of 


at 


T2 




0.056 


0.105 


0.037 



assuming time-stationary transition rates of {Mt} is not reasonable. Our general setup allows for 
Guit) in ([2]) to be time-dependent and could be used to capture these patterns. Other calendar- 



year effects, such as the impact of the school-year start He et al 2010 , can be deterministically 



added. A more serious concern is the assumption of known seasonal transition rates Hij] however 
in our 2-state example those may be reasonably well-calibrated using historical data such as that 
used in Figure [Tj Our numerical experiments also show that mis-specification of /Xjj is not a 
serious problem for sequential inference of . 

The model ( |20[ ) assumes that there is a single infectiousness parameter Oi which remains 
constant throughout, such that the actual contact rates are of the form ai{It,Mt) = 9i(l + 
5Fl{^j=2})^i(-^i)- This representation is for convenience only; it is straightforward to consider 
the case where we separately carry along priors for each 9i{i). We could also consider the case 
where each new season leads to a "fresh" 9i (e.g. from a new strain of the pathogen), i.e. the contact 
rate on each interval [cj^, cj^+i), where (a^) are the transition epochs of {Mt} is (£)(Mo-^) ~ p{0). 

(i) 

In that case, the sufficient statistics are simply reset to the original prior when the correspond- 
ing particle copy of mp^ changes states. However, the difficulty with this method is that 0i(i)'s 
are not independent. For instance, to have the interpretation of Mt = 2 being the high season, we 
require 6i{2) > 6*1(1). This precludes the simple specification of the respective marginals through 
a Gamma prior. The assumption of a constant seasonal factor SF is a convenient work-around 
which enforces the epidemiological meaning of seasonality. 

Our choice of a two-compartment SIS model was due to its simplicity. Since our methods can 
handle any HMSKM, it is immediate to extend to further compartments (e.g. an SEIRS model 
with Xt = {St, Et, It, Rt) that would further include the Exposed compartment of individuals 
who are infected but not infectious, and the Recovered compartment for individuals who have 
temporary immunity) or multiple seasonal regimes (e.g. adding a third "pandemic" regime to 
capture outbreaks like 2009 HlNl influenza). By modifying the propensity functions hq{t, Xt) one 
can also reflne the modeling of population mixing He et al|2010 , incorporate external immigration 
of individuals, or include age-structured populations. 

Finally, an extra possibility is to allow two-way feedback between the epidemic state {Xt} 
and the seasonal factor {Mt}. Thus, beyond the modulation of the transition times of X by 
M, we can also introduce effect of X on M through Gm = Gujit, ^t)- In other words, the 
seasonality dynamics are themselves affected by the epidemic. For instance, a large outbreak 
could be due to a long-surviving pathogen which in turn prolongs the expected length of a high- 
season. Alternatively, one could consider the case where each infection increases the chances of 



a genetic mutation of the pathogen, thereby decreasing /i2i. We refer to Ludkovski 2012a for 
SMC algorithms to address this possibility. 

The most severe constraint of our model is the assumption of fully observing {It}- In principle 
this could be achieved by exhaustive monitoring of all individuals' status. More realistically. 
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we view this model as an idealized case of biosurveillance which contends with extrinsic model 
uncertainty regarding G and {Mt} while eschewing missing data. This then provides a useful 
benchmark to analyze data collection quality. We refer to Ludkovski and Niemi [2011 for a 



related setup with discrete-time observations based on binomial sub-sampling which also admits 
a PL algorithm. 



5 Optimized Policy Response 



As mentioned in the introduction, an important use of sequential inference is for policy response. 
In this section we briefly investigate such biosurveillance decision-making using the developed 
inference tools. 

Public health policy makers act sequentially as outbreak data is collected. Their aim is to 
balance total costs which consist of morbidity costs Cm associated with the pathogen, and costs 
Ca associated with policy actions. Denote by S M the policy implemented at date t, where we 
encode the action space as a subset of the real line. We consider time-additive costs on a given 
horizon [0, T] of the form 



Cm '■= / c(It) dt and 
Jo 

Ca ■■= [ ^tdt+ V K{A4>, 
JO , "XT 



(25) 



where c{It) are the morbidity costs expressed as instantaneous rates, and without loss of generality 
we take (pt to be the instantaneous cost of the respective action. The last term in (25) with 
A(j)t = <pt — corresponds to potential additional start-up costs K{-) associated with changing 
a policy. 

In contrast to the original presentation, now (pt dynamically drives the evolution of {It, Mj}. The 
impact of policy actions can be either direct or indirect. Direct actions influence the transition 
rates, whereby G = Q{(j)t)- For example, quarantine can cut down infectiousness rates in a 
population. Indirect actions influence the transition matrix Cm = GMi4't) of {Mt}, e.g. making 
low-season regime Mt = 1 more likely. These can be interpreted as prophylactic measures that 
mitigate the seasonal effect and contain the disease at its baseline morbidity. In either case, since 
applied policies must be based only on currently available information, the control is required 
to be J-^-adapted. 

We assume that the overall objective is to minimize average (expected) costs over the interval 
[0, r] across all potential dynamic policies. 



inf [Cm + XCa] 



infE"^ 



cilt) + \{(l)t + KiAcPt)} dt 



(26) 



where A is a Lagrange multiplier and to emphasize the impact of (j) we denote the resulting 
probability measure as P*^. As A is increased, the budget constraints tighten; for A = the 
optimization is solely about minimizing expected morbidity. 



The optimization problem (26) is a partially observed stochastic control problem. The presence 



of unobserved quantities makes it nonstandard; a general approach is to reduce it to a standard 
setting by passing from It to the augmented hyperstate Zt. As we have seen before, {2t} is finite 
dimensional if at least one of {Mt} or G is observed, but is infinite-dimensional in the main case 
of interest (c) requiring joint inference. Using the Markov structure, the optimal policy response 
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(/>J' at time t is a function of the posterior state Zt^ = ^{It, Zt,(l)t-), for some strategy rule 
$. Analytic treatment of such infinite-dimensional control problems is generally intractable; see 



Ludkovski and Niemi 2010 , Ludkovski 2012b for flexible numerical approximations that also 



rely on SMC. 

Rather than carry out a full optimization, we investigate below some simple heuristics for 
such rules As our case study we consider a simple example with a binary policy response, 
(pt G {0, 1} with (j)t = 1 indicating the implementation of preventive measures at date t. We 
assume that transition rates are fixed but policy makers can influence the environment {Mt} 
through 

^M,.=ri^2 -2) ^M,.=i=i^^ -8 

Thus, when counter-measures are enacted, {Mt} is much more likely to be in the low state. 
Indeed, while without any actions the high season is expected to last 6 months, with mitigation 
it will only last an average of 52/8 ~ 6 weeks. 

With an on/off decision-making, the dynamic policy rule is described through its action regions 
Dq and -Di, such that a response is initiated as soon as {It, Zt) € Di and (pt- = 0, and terminated 
as soon as {It,Zt) £ Dq an (pt- = 0. Note that we expect to have a hysteresis region where the 
existing policy, whatever it may be, is continued, since there is no sufficiently strong evidence 
to change the response. To compare, we consider three potential classes of policies that are 
distinguished by the information used: 

The infecteds-based control relies directly on It- Since start (resp. end) of outbreaks is char- 
acterized by persistent upward (resp. downward) trend, we take D^"-^'^ = {/j — xj:'^'^ > 
I},Z)„^"/'^ = {/,-xW</}, where 



is an exponentially weighted moving average of the number of infecteds using discount weight 
K. Thus, measures are initiated when It is sufficiently above its moving average (strong 
upward trend), and stopped when It —l^t'^^ is sufficiently negative. From our experiments, 
using a 14-day moving average offers a good way of capturing trends. This is a simple policy 
that requires no inference and can be seen as rule of thumb to provide response when {It} 
is growing. 

The Bayesian policy incorporates the full history of observations and the prior beliefs through 
the posterior distribution Zt. As a simple choice we consider only the posterior of {Mt} and 
analyze jj^"-y^'^ = > yf} and Dj^^-y^^ = jn^ < vr}. Thus, countermeasures are started 
once the posterior probability of {Mt = 2} is above vf (overwhelming evidence of a high 
season) , and are stopped once the posterior probability is below tt. 

The Oracle policy takes L*^*^^' = {Mt = 1}, Z)f''^' = {Mt = 2}, 01 (f>t = l{Aft=2}> i-e- the 
control is applied precisely during the high season. This is an idealized benchmark since it 
assumes that the policy-maker actually observes the latent seasonality variable. We note 
that it is still not the globally optimal benchmark since in principle it may not be worthwhile 
to respond immediately as soon as the high season begins (e.g. if It is still low then ostensibly 
Mt could revert back to its low state quickly without the need for costly interventions). 

Clearly, the above list is not exhaustive and is for illustration purposes only. An infinite variety of 
further rules can be constructed. For example, including posterior information about B is clearly 
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relevant to understand the severity of the outbreak and its Hkely future course. In general, an 
automated way to build an optimal policy is more appropriate than heuristics, albeit at a cost of 
losing some of the simplicity and intuition for the decision maker. 

Once a class of policies is chosen, the decision rule needs to be optimized by minimizing over 
the described parametric forms such as (vr, vf) or (/,/). Note that even for a fixed rule, the 
expected costs are not analytically available since the distribution of It or Mt under P*^ is not 
explicitly computable. We therefore resort to predictive analysis via Monte Carlo. Namely, fixing 
the policy rule, we simulate a large number (several hundred in our example below) of scenar- 
ios, i.e. trajectories of {It, Mt, (pt), and then average the resulting scenario costs to approximate 
E'^[Cm + XCa]. 

Simulation under the controlled measure P*^ is straightforward thanks to the strong Markov 
properties of {Zt} and is summarized in Algorithm [2j For simplicity we restrict to the case 
where (pt only changes at event times r^. For these simulations we draw the actual parameters O 
independently from the given prior p{@), i.e. assuming the model is correctly specified and then 
generate {It,Mt} using the Gillespie algorithm. 

Require: (Mq , /q , , 0o ) 
1: Sample outbreak parameters G ~ -Zq 
2: s ^ 
3: loop 

4: Simulate the next regime change date o" > s of {Mt} using the generator GMi<Ps) 

5: Simulate the next transition r > s of {It} conditional on (Mg, (ps, ©) 

6: p ^ (T A T 

7: Save M[s^p] and I[s^p] 

8: Using PL Algorithm [l] update the filter {Zt} on [s,p] 

9: Update the policy (pp ^ ^{Ip, Zp, (ps) 
10: Set s ^ p 
11: end loop 

Algorithm 2: Simulation of a controlled epidemic model 



5.1 Cost Functionals 

There are many possible summary statistics to evaluate the relative merit of different miti- 
gation strategies. Among morbidity measures, one can consider average number of infecteds 
Jq Itdt], maximum infecteds E'^[maxo<(<T -^t], or the proportion of time that It is above 
some level Ihigh, ^'^[Jq '^{it>iMgh} ^^]- include metrics regarding the response, such 

as the average length of time countermeasures are enacted IE''^[/o^ l{0(=i} dt], the number of times 
counter-measures are started ^"^[J^iKT ^{A^tT^o}]; ^tc. To illustrate, we consider the following two 
examples of cost functionals: 

ci (/,(/>):=/ It + 0.02{It- 200)1 + 50 -l^^^^iydt; (27a) 
Jo 

C2{I, cP) := / {It + 1000 • l{/,>3oo} + 200 • l{^,=i}) dt + 1400 • ^ 1{a<^,=i}- (27b) 

Jo ^^rp 

The cost functional ci has a piecewise-quadratic cost in terms of the number of infecteds It- 
Here a "soft" threshold of Ihigh = 200 infecteds is applied, as well as a basic morbidity cost 
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that is proportional to It- This ci also mildly penalizes the amount of time counter-measures 
are applied through the third term in (27a). The cost functional C2 is geared more towards 
minimizing mitigation resources. It has much higher policy costs of 200 per day and also rewards 
policy stability (i.e. avoiding too many changes in policy or "chattering"). The latter is taken 
into account in (27b) through the penalty K{A(j)t) = 1400 • l{A0t=i} which imposes a start- 
up cost of 1400 (week's worth of policy costs) each time the counter-measures are begun. In 
terms of outbreak costs, C2 considers total number of infecteds-days and additionally imposes a 
discontinuous penalty whenever It exceeds 300 (i.e. 3% of the population) which can be thought 
of as a "hard" (but high) target regarding tolerable number of infecteds. 
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Figure 7: Dynamic Bayesian control of {Mt} over a sample trajectory of the outbreak. The 
top panel shows the infecteds numbers {It}', the bottom panel shows the true seasonal factor 
{Mt} and the posterior probability of high season {11^}. Countermeasures are applied as soon as 
> 0.95 = vf and stopped once 11^ < 0.05 = vr. The intervals of action are indicated by the solid 
bars on the x-axis, and the times of policy changes are marked with stars. The PL algorithm 
used J = 5000 particles. All other parameters are from Table [Tl 



To generate outbreaks, we fix 02 = 0.25 and sample 9i ~ 6*0(1700,20 • 365), which roughly 
means 9i E [0.21,0.25]. The remaining parameters, including SF = 0.15, are from Table [!} 
Figure [T] shows the resulting Bayesian-type dynamic policy on a sample trajectory of {It}- In this 
example, counter-measures start once the probability of being in the high season is at least 95% 
and are ended once 11^ < 0.05 and It < lugh = 200; the latter is to make sure that the outbreak 
is fully contained. In the figure, the filter reacts rather slowly to the first outbreak, perhaps due 
to lower than normal numbers of infecteds at its onset. Coincidentally, once counter-measures 
are finally started, {Mt} reverts back to low season almost immediately. The second outbreak 
begins around seven months and is responded to within 20 days. At the end of the simulation, 
Mt = 2 remains in the high season. Note that the unrealistic assumption (solely for simplicity 
of presentation) that {Mt} goes back to its original transition matrix once (j)t = Q (even after 
counter-measures were applied previously) implies that multiple high seasons are likely to occur 
over the nine months. 
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(0.95,0.05) 


108 


30% 


19 


2.6 


56 


64 



Table 3: Expected costs and summary statistics of selected response strategies over 500 simulated 
scenarios. The chosen Infecteds-based rule is Z?!"-^*^ = {It—T^'^^ > I},D^^-^'^ = {It—X'f'^ < ICiIt < 
Ihigh} and the chosen Bayesian rule is D^^-y^"^ = {n| > vf} and Dj^^-y^^ = {n^ < vrn/i < Ihigh} for 
the specified thresholds. Bayesian algorithms used J = 3000 particles. The Baseline policy is to 
do nothing (pt = 0, and the Oracle policy is cpt = l{Mt=2}- Standard deviations of the computed 
expected values are in brackets. 



The performance of several mitigation policies is compared in Table [3| The table demonstrates 
that depending on the priorities of the policy makers, different mitigation strategies should be 
considered. Crucially, one must consider the trade-off between fast response and potentially 
unnecessary interventions. This trade-off is clearly observed with some strategies being more 
aggressive (and hence producing lower expected morbidity) and other strategies being more con- 
servative. For instance, the Bayesian policy with (vr, vf) = (0.01,0.8) is much more aggressive 
than a similar policy with (vr, tt) = (0.05, 0.95), as it starts counter-measures as soon as there is at 
least 80% chance of being in the high season, and continues them until the posterior probability 
drops below 1% (compared to starting at 11^ > 95% and stopping as soon as 11^ < 5%). Not 
surprisingly, it reduces average number of infecteds by nearly 15% and the average number of 
days when there are more than 300 infecteds by over 30% in comparison. This comes at a cost 
of over 40 additional days on average when countermeasures are applied. Which policy is better 
therefore depends on the weightings (cf. Lagrange multiplier A in (26)) placed on the different 
ingredients of the cost functional; here E'^[ci] is smaller for (vr, vf) = (0.01,0.8), while ^"^^[02] is 
smaller for (vr,vf) = (0.05,0.95). 

Overall, we find that strategies based on Bayesian inference seem able to outperform policies 
based on It only, though the difference is not very statistically significant. Of course, this result 
must be tempered as we did not perform an exhaustive optimization and it obviously depends 
on the cost functionals considered. We also note that strategies differ a lot in achieving similar 
results. For instance the Bayesian policy with (vr, vf) = (0.01, 0.8) has similar expected costs using 
the functional ci as the Infecteds-based policy with (/, /) = (—10,20). However, the latter is 
much more variable (over 5.3 expected policy start-ups) in time, though lasting shorter periods 
(20% less frequently). 

As a final comparison, Table [3] points out that no action at all is clearly sub-optimal and 
whatever the aims of the policy makers, some mitigation is obviously beneficial. Moreover, the 
idealized Oracle control is in fact not optimal either, since it does not account for the dynamics of 
{It} and by tracking {Mt} exactly tends to act /end too quickly. Further tailoring and refinement 
of risk metrics is obviously needed for practical use and raises a host of interesting inter-disciplinary 
questions that will be explored in upcoming works. 
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